Chapter 8 - Comparison of lists and sets

You've been introduced to two containers in this topic: lists and sets. A question we often get is when to use a list and when to use a set. The goal of this chapter is to help you answer that question.

At the end of this chapter, you will be able to:

  • decide when to use a list and when to use a set

If you have questions about this chapter, please refer to the forum on Canvas.

1. Properties of sets and lists

Sets: unordered collection of unique elements

Lists: ordered collection of elements

Comparison lists vs sets

property                                  set                 list
can contain duplicates                    no                  yes
ordered                                   no                  yes
finding element(s)                        relatively quick    relatively slow
can contain                               immutable objects   all objects
elements can be added/removed (mutable)   yes                 yes

1.1 Duplication of elements

  • list: yes
  • set: no

As shown below, lists allow duplicates (e.g. the integer 1 appears three times in list1), but sets do not: set2 is written with the same duplicates, yet it ends up equal to set1.


In [ ]:
list1 = [1, 2, 1, 3, 4, 1]
set1 = {1, 2, 3, 4}
set2 = {1, 2, 1, 3, 4, 1}

In [ ]:
print('list1', list1)
print('set1', set1)
print('set2', set2)
print('set1 is the same as set2:', set1 == set2)

Tip

You can create a set from a list. Note that any duplicates will be removed.


In [ ]:
a_list = [1, 2, 3, 4, 4]

a_set = set(a_list)

print(a_list)
print(a_set)
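Because the conversion removes duplicates, comparing the length of a list with the length of the set made from it is one simple way to check whether the list contained duplicates. The sketch below uses made-up example names; `names`, `unique_names`, and `has_duplicates` are illustrative variable names, not part of the chapter.

```python
# Hypothetical example data: a list of names with one repeated entry.
names = ["Ann", "Bob", "Ann", "Cem"]

# Converting to a set drops the repeated "Ann".
unique_names = set(names)

# If the set is shorter than the list, the list contained duplicates.
has_duplicates = len(unique_names) < len(names)

print(len(names))         # 4
print(len(unique_names))  # 3
print(has_duplicates)     # True
```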

1.2 Order (with respect to how elements are added to it)

  • list: yes
  • set: no

The order in which you add elements to a list matters. Please look at the following example:


In [ ]:
a_list = []
a_list.append(2)
a_list.append(1)
print(a_list)

However, this information is not kept in sets:


In [ ]:
a_set = set()
a_set.add(2)
a_set.add(1)
print(a_set)

Is it possible to understand the order of items in a set? Yes, but we will not cover it here, since it is not important for the tasks we work on.

What, then, is the take-home message about order? Lists keep the order in which you added the elements; sets do not.

If you want to learn more about this, look up the data structure called a hash table (https://en.wikipedia.org/wiki/Hash_table).
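One practical consequence of the point about order can be seen when comparing containers. The small sketch below (with illustrative variable names) shows that two lists holding the same elements in a different order are not equal, while two sets with the same elements are.

```python
# The same two elements, written in a different order.
list_a = [2, 1]
list_b = [1, 2]
set_a = {2, 1}
set_b = {1, 2}

# Lists remember the order, so these two lists differ.
lists_equal = list_a == list_b
print(lists_equal)  # False

# Sets do not keep the order, so these two sets are the same.
sets_equal = set_a == set_b
print(sets_equal)   # True

# Because a list is ordered, you can ask for the element at a position.
print(list_a[0])    # 2
```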

1.3 Finding element(s)

It's usually quicker to check if an element is in a set than to check if it is in a list.

Hence, this will usually be relatively slow:


In [ ]:
list1 = [1, 2, 3, 4]
print(1 in list1)

And this will usually be relatively quick:


In [ ]:
set1 = {1, 2, 3, 4}
print(1 in set1)

Is it possible to understand why finding elements is faster in sets than in lists? Yes, but we will not cover it here, since it is not important for the tasks we work on.

What, then, is the take-home message about speed? Checking whether an element is in a container is usually quicker with a set than with a list.
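With only four elements the difference is too small to notice. If you are curious, the sketch below measures it on a larger container using the standard-library `timeit` module. The container size, the number of repetitions, and the variable names are arbitrary choices for illustration, and the exact timings will differ from machine to machine.

```python
import timeit

# Build a large list and a set with the same 100,000 integers.
n = 100_000
big_list = list(range(n))
big_set = set(big_list)

# Look for the last element, which is the slowest case for the list.
target = n - 1

# Repeat each membership check 100 times and measure the total time.
list_time = timeit.timeit(lambda: target in big_list, number=100)
set_time = timeit.timeit(lambda: target in big_set, number=100)

print(f"list: {list_time:.5f} seconds")
print(f"set:  {set_time:.5f} seconds")
```

On a container of this size, the set check should come out clearly faster than the list check.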

1.4 Mutability of the elements a container can hold

Sets can only contain immutable objects. This works:


In [ ]:
a_set = set()
a_set.add(1)
print(a_set)

This does not, because a list is mutable (Python raises a TypeError):


In [ ]:
a_set.add([1])

Lists can contain any Python object. This works:


In [ ]:
a_list = []
a_list.append(1)
print(a_list)

This works as well:


In [ ]:
a_list = []
a_list.append([1])
print(a_list)
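The difference described in this section can also be checked without stopping the notebook on an error. The sketch below assumes you are comfortable with `try`/`except` (which may not have been covered yet); the variable `added` is only an illustrative flag.

```python
# Integers and strings are immutable, so a set accepts them.
a_set = {1, "one"}

# A list is mutable, so adding it to a set raises a TypeError.
try:
    a_set.add([1])
    added = True
except TypeError as error:
    added = False
    print("Could not add the list:", error)

# The set is unchanged after the failed attempt.
print(len(a_set))  # 2
print(added)       # False
```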

2. When to choose what?

Lists if you need:

  1. duplicates
  2. the order in which items are added
  3. mutable objects

All other scenarios -> sets
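To make these rules of thumb concrete, here is a small sketch using made-up data (items scanned at a checkout). It is only one possible illustration: the list is used where order and duplicates matter, and a set is used where only the distinct items matter.

```python
# Hypothetical data: items scanned at a checkout, in this order.
scanned = ["apple", "bread", "apple", "milk"]

# A list keeps the order and the duplicates.
first_item = scanned[0]
apples = scanned.count("apple")
print(first_item)  # apple
print(apples)      # 2

# A set keeps only the distinct items and is well suited to membership checks.
distinct = set(scanned)
has_milk = "milk" in distinct
print(len(distinct))  # 3
print(has_milk)       # True
```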

Exercises

Exercise 1:

Which container can contain duplicates?


In [ ]:

Exercise 2:

Which container is the faster choice when checking whether it contains an element?


In [ ]:

Exercise 3:

You want to collect and count all the people taking this class. You can only use their first names. Do you choose a list or a set?


In [ ]:

Exercise 4:

Can you think of a use case for a set and one for a list (for example, in text analysis)?


In [ ]: